COMPOSITIONS AND METHODS FOR PROTEIN 
PURIFICATION BASED ON A METAL ION AFFINITY SITE 



BACKGROUND OF THE INVENTION 



Cross Reference to Related Applications 

This application claims benefit of non-provisional 
application US Serial number 09/078,687 filed May 14, 1998. 

Field of the Invention 

This invention relates generally to the field of protein 
chemistry. Specifically, the present invention relates to protein 
purification using a metal ion affinity site. 

Background of the Invention: 

Development of protocols for the isolation and 
purification of proteins is often a long and costly process. Such 
protocols usually contain multiple steps, where some of the steps 
have recoveries as low as 50%. Further, due to variation between 
protein molecules, a purification protocol developed and effective 
for the purification of one protein is not necessarily useful for the 
purification of another. In fact, in most cases considerable 
adaptations must be made to a purification protocol to 
accommodate the various physical and chemical characteristics of 
different proteins. 

The ability to prepare hybrid genes by genetic 
engineering technology has opened up new possibilities for the 
purification of proteins. For example, one can link a DNA 
sequence of a protein of interest to a nucleic acid sequence which 
codes for a peptide which has a high binding affinity for a specific 
ligand. The fusion protein product resulting from expression of 
this DNA has attributes of both the protein of interest and the 
high affinity peptide. To purify or immobilize the engineered 
fusion protein, the ligand commonly is linked to a support, and the 
unpurified, engineered protein is then exposed to the 
ligand/support composite and allowed to bind. 

There are numerous advantages of using a high 
affinity fusion protein. For example, the use of an affinity peptide 
ensures that no part of the native protein of interest is involved in 
adsorption, that is, the binding between the fusion protein and the 
ligand. At the same time, extremely high selectivity in the 
adsorption process is achieved. 

Immobilized Metal Ion Affinity Chromatography 
(IMAC) is one of the most frequently used techniques for 
purification of fusion proteins containing affinity sites for metal 
ions. Proper choice of immobilized metal ion, loading conditions 
and elution conditions can result in protein purification of up to 
about 95-98% in a single chromatographic step. Moreover, 
recovery generally is higher than 85%. In addition to the 
advantages discussed above, incorporation of a proteolytic, 
chemical, or enzymatic cleavage site into the composite DNA, 
between the affinity peptide and the sequence of the protein of 
interest, provides a means for cleaving the affinity peptide from 
the protein of interest to yield the native protein of interest in 
highly purified form. 
The following publications are representative of the 
art: Itakura, et al., Science 198:1056-63 (1977); Germino et al., 
PNAS USA 80:6848-52 (1983); Nilsson et al., Nucleic Acid Res. 
13:1151-62 (1985); Smith et al., Gene 32:321-27 (1984); 
et al., U.S. Pat. No. 5,284,933; and Dobeli, et al., U.S. Pat. No. 
5,310,663. 

The prior art is deficient in improved compositions 
and methods for affinity immobilization and purification of 
proteins. This invention fulfills this long-felt need in the art. 
SUMMARY OF THE INVENTION 



The present invention relates to compositions and 
methods for protein purification involving the use of novel, 
genetically-engineered fusion proteins. These fusion proteins are 
engineered to allow for immobilization and purification via 
high affinity interaction of an affinity peptide of a fusion protein 
with a ligand. The affinity peptide is a histidine-rich polypeptide 
sequence with a general sequence: (HXn)m, wherein H is histidine, 
X is an amino acid other than histidine, n= 1-8, m= 2-30, and 
wherein if n=1 for more than two adjacent units of HX, at least one 
X must be asparagine, phenylalanine, tryptophan, tyrosine, lysine, 
methionine, arginine, glutamine, or cysteine. The affinity peptide 
is linked to the proteins of interest R1 and R2 to yield a fusion 
protein with formula R1-(HXn)m-R2. In a preferred embodiment of 
the invention, n=1-4 and m=2-10. In a more preferred 
embodiment of the invention, n=1-4 and m=3-6. In a specific 
embodiment of the invention, a fusion protein having the 
sequence SLKDHLIHNVHKEEHAHAHNKISVVGVGAVGM (SEQ ID 
No:6) is provided. In another embodiment of the present aspect 
of the invention, at least one protease cleavage site is inserted 
between the sequence of the protein of interest and the sequence 
of the affinity peptide. 
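
As an illustration only, the (HXn)m rule stated above can be written as a short Python check. This is a sketch under one reading of the rule: each unit is assumed to start at a histidine and run to the next histidine, so that n may differ from unit to unit. The function names are hypothetical and are not part of the specification.

```python
# Residues, at least one of which must appear as X in any run of more than
# two adjacent n=1 units (N, F, W, Y, K, M, R, Q, C in one-letter code).
ALLOWED_X = set("NFWYKMRQC")


def split_units(peptide):
    """Split a peptide that starts with H into HX_n units.

    Assumption: a unit begins at each histidine and extends to the
    residue before the next histidine.
    """
    if not peptide or peptide[0] != "H":
        raise ValueError("peptide must start with histidine (H)")
    units = []
    for residue in peptide:
        if residue == "H":
            units.append("H")
        else:
            units[-1] += residue
    return units


def is_affinity_peptide(peptide):
    """Return True if the peptide fits (HXn)m with n=1-8 and m=2-30."""
    if not peptide.startswith("H"):
        return False
    units = split_units(peptide)
    if not 2 <= len(units) <= 30:                      # m = 2-30
        return False
    if any(not 1 <= len(u) - 1 <= 8 for u in units):   # n = 1-8
        return False
    run = []                        # X residues of consecutive n=1 units
    for unit in units + [""]:       # empty sentinel flushes the last run
        if len(unit) == 2:
            run.append(unit[1])
        else:
            if len(run) > 2 and not ALLOWED_X & set(run):
                return False
            run = []
    return True
```

Under this reading, the histidine-rich stretch HLIHNVHKEEHAHAHNK found within SEQ ID No:6 parses into six units (n = 2, 2, 3, 1, 1, 2) and passes, whereas HAHAHA fails because three adjacent n=1 units carry only alanine.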

In another aspect of the invention, there is provided a 
DNA sequence coding for a fusion protein comprising a protein of 
interest fused at its amino-terminus or carboxy-terminus to at 
least one affinity peptide, where the fusion protein has the 
general formula R1-(HXn)m-R2, wherein R1 or R2 is the protein of 
interest, H is histidine, X is an amino acid other than histidine, n= 
1-8, m= 2-30, and wherein if n=1 for more than two adjacent 
units of HX, at least one X must be asparagine, phenylalanine, 
tryptophan, tyrosine, lysine, methionine, arginine, glutamine, or 
cysteine. In a specific embodiment of this aspect of the invention, 
there is provided a DNA sequence which codes for a protein where 
the fusion protein has the sequence 

SLKDHLIHNVHKEEHAHAHNKISVVGVGAVGM (SEQ ID No:6). 
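
To make the relationship between the amino acid sequence and a DNA sequence coding for it concrete, the following Python sketch back-translates SEQ ID No:6 using the standard genetic code. The choice of one codon per amino acid is an arbitrary assumption made for illustration; it is not the codon usage of SEQ ID Nos. 1-5, and the function names are hypothetical.

```python
# One arbitrarily chosen standard-code codon per amino acid (an assumption;
# any synonymous codon would encode the same residue).
CODON_FOR = {
    "A": "GCT", "C": "TGC", "D": "GAT", "E": "GAA", "F": "TTC",
    "G": "GGT", "H": "CAT", "I": "ATC", "K": "AAA", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCG", "Q": "CAG", "R": "CGT",
    "S": "TCT", "T": "ACC", "V": "GTT", "W": "TGG", "Y": "TAC",
}
AMINO_ACID_FOR = {codon: aa for aa, codon in CODON_FOR.items()}


def back_translate(peptide):
    """Return one possible coding DNA sequence for a peptide."""
    return "".join(CODON_FOR[aa] for aa in peptide)


def translate(dna):
    """Translate DNA made only of the codons in CODON_FOR."""
    return "".join(AMINO_ACID_FOR[dna[i:i + 3]] for i in range(0, len(dna), 3))


SEQ_ID_6 = "SLKDHLIHNVHKEEHAHAHNKISVVGVGAVGM"
coding_dna = back_translate(SEQ_ID_6)
```

Because the genetic code is degenerate, many different DNA sequences code for the same peptide; the sketch returns only one of them.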

In various embodiments of this aspect of the 
invention, there is provided a recombinant vector comprising an 
expression vector and a DNA sequence coding for a fusion protein 
comprising a protein of interest fused at its amino-terminus or 
carboxy-terminus to at least one affinity peptide as described 
above, wherein the recombinant vector is capable of directing 
expression of the DNA sequence in a suitable host organism. The 
present invention also provides a host organism containing a 
recombinant expression vector comprising a DNA sequence coding 
for a fusion protein comprising a protein of interest fused at its 
amino-terminus or carboxy-terminus to at least one affinity 
peptide as described above, wherein the organism is capable of 
expressing said DNA sequence. 

In yet an additional aspect of this invention, there is 
provided a method for purifying the novel fusion proteins of the 
present invention, comprising the steps of: contacting a protein 
sample containing the fusion protein in a mixture with other 
proteins with a metal chelate resin under conditions where the 
fusion protein binds to the resin to produce a resin-fusion protein 
complex; washing the resin-fusion protein complex with a buffer 
to remove the other, unbound proteins; and eluting the bound 
fusion protein from the washed resin-fusion protein complex. One 
embodiment of this method further includes providing at least one 
protease cleavage site between the protein of interest and the 
affinity peptide, and cleaving the protein of interest from the 
affinity peptide after purification using the metal chelate resin. 

Other and further aspects, embodiments, features and 
advantages of the present invention will be apparent from the 
following description of the invention. 
BRIEF DESCRIPTION OF THE FIGURES 

Figure 1 is a schematic representation of the 
pGFPuv/HAT vector. This vector contains one embodiment of the 
affinity peptide of the present invention fused to the N-terminus 
of a UV mutant of Green Fluorescent Protein. Only 
restriction sites are noted. 

Figure 2 is a schematic representation of the vector 
(pUC19/HS) containing part of the affinity peptide (AP) at the N- 
terminus of an enterokinase (EK) cleavage site followed by a multiple 
cloning site (MCS). Only restriction sites are denoted. 

Figure 3 is a schematic representation of the 
restriction maps of a vector with three frame shifts containing 
part of the affinity peptide and an enterokinase cleavage site, 
used for expression of recombinant proteins. 

DETAILED DESCRIPTION OF THE INVENTION 

The present invention relates to compositions and 
methods for purification of novel, genetically-engineered fusion 
proteins via high affinity interaction of an affinity peptide of 
the fusion protein with a ligand. The high affinity peptide is a 
histidine-rich polypeptide with a general sequence (HXn)m which is 
linked to proteins of interest R1 or R2; wherein H is histidine, X 
is an amino acid other than histidine, n= 1-8, m= 2-30, and wherein 
if n=1 for more than two adjacent units of HX, at least one X must 
be asparagine, phenylalanine, tryptophan, tyrosine, lysine, 
methionine, arginine, glutamine, or cysteine. The affinity peptide 
allows for immobilization and purification of the fusion protein. 
The affinity of the high affinity peptide is for an 
immobilized metal ion. The strength of binding between the high 
affinity peptide and an appropriate metal ion is very high; thus, 
isolation of the fusion proteins is very selective. However, 
association between the peptide and the ligand is also reversible. 
The fusion protein can be eluted from the metal ion/adsorbent by 
addition of a competing agent such as imidazole, or by decreasing 
the pH, which causes protonation of the nitrogen in the imidazole 
ring of the histidine side chain and release of the adsorbed 
protein. Because of this reversibility, the protein is recovered in 
a purified, unbound form. Further, regeneration and reuse of the 
metal ion support multiple times, even more than 100 times, is 
possible. 
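
The effect of lowering the pH can be illustrated with the Henderson-Hasselbalch relation. The Python sketch below estimates the fraction of histidine imidazole side chains that are protonated at a given pH. The pKa of 6.0 is an assumed textbook value for the free side chain, not a figure taken from this specification; the effective pKa of a histidine within a given peptide may differ.

```python
def fraction_protonated(ph, pka=6.0):
    """Fraction of imidazole groups carrying a proton at the given pH.

    From Henderson-Hasselbalch: [HA]/([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    The default pKa of 6.0 is an assumption used only for illustration.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


loading = fraction_protonated(7.0)   # about 0.09 at the pH 7.0 loading buffer
acidic = fraction_protonated(4.0)    # about 0.99 after an acidic shift
```

With the assumed pKa, roughly one imidazole in eleven is protonated at pH 7.0, while nearly all are protonated at pH 4.0; a protonated imidazole nitrogen is no longer available to coordinate the metal ion.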

An additional feature of the protein purification and 
immobilization techniques based on the principles of the present 
invention is the high probability that the purified and regenerated 
protein of interest will retain full biological activity and 
specificity. This is because the affinity peptide is involved in 
the immobilization/binding process, whereas the portion of the 
fusion protein that contains the protein of interest is not. 

Incorporation of a proteolytic site between the high 
affinity peptide and the sequence of the protein of interest 
provides a means to regenerate the protein of interest from the 
fusion protein. Regeneration is achieved by limited proteolysis of 
the fusion protein after the first chromatography step, which is 
used to isolate and purify the fusion protein. In the second 
chromatography step following proteolysis, the cleaved 
regenerated protein of interest passes through the column without 
immobilization or adsorption, whereas the high affinity peptide is 
adsorbed on the column. 
One embodiment of the present invention features 
incorporation of nucleic acid sequences which code for secretion 
signals into the DNA sequence that codes for the fusion protein. 
Such secretion signals cause the fusion protein to be secreted into 
culture media after synthesis in a host cell. Since a considerable 
amount of total cellular protein remains in the cell, secretion 
improves dramatically the isolation and purification of the fusion 
protein by eliminating the need for cell disruption, protein 
extraction, and/or removal of unwanted cellular components. 

As used herein, the terms "high affinity peptide" and 
"affinity peptide" refer to a histidine-rich polypeptide with a 
general sequence (HXn)m which is linked to proteins of interest R1 
or R2; wherein H is histidine, X is an amino acid other than 
histidine, n= 1-8, m= 2-30, and wherein if n=1 for more than two 
adjacent units of HX, at least one X must be asparagine, 
phenylalanine, tryptophan, tyrosine, lysine, methionine, arginine, 
glutamine, or cysteine. 

As used herein, the term "protein of interest" shall 
refer to any protein to which the affinity peptide is fused for the 
purpose of purification or immobilization. 

As used herein, the term "fusion protein" shall refer to 
a protein hybrid containing the affinity peptide and the protein 
of interest or any amino acid sequence of interest. The fusion 
protein has the general formula R1-(HXn)m-R2; wherein R1 or R2 is 
the protein of interest, H is histidine, X is an amino acid other than 
histidine, n= 1-8, m= 2-30, and wherein if n=1 for more than two 
adjacent units of HX, at least one X must be asparagine, 
phenylalanine, tryptophan, tyrosine, lysine, methionine, arginine, 
glutamine, or cysteine. 

As used herein, the terms "secretion sequence" or 
"secretion signal sequence" shall refer to an amino acid signal 
sequence which leads to the transport of a protein containing the 
signal sequence outside the cell membrane. In the present case, a 
fusion protein of the present invention may contain such a 
secretion sequence to enhance and simplify purification. 
Representative examples of secretion signal sequences are well 
known to those having ordinary skill in this art. 

As used herein, the term "proteolytic site" shall refer 
to any amino acid sequence recognized by any proteolytic enzyme. 
In the present case, a fusion protein of the present invention may 
contain such a proteolytic site between the protein of interest and 
the affinity peptide and/or other amino acid sequences so that the 
protein of interest may be separated easily from these 
heterologous amino acid sequences. 

As used herein, the term "metal ion" refers to any 
metal ion for which the affinity peptide has affinity and that can 
be used for purification or immobilization of a fusion protein. 

As used herein, the terms "adsorbent" or "solid 
support" shall refer to a chromatography or immobilization 
medium used to immobilize a metal ion. 

As used herein, the term "regeneration", in the context 
of the fusion protein, shall refer to the process of separating or 
eliminating the affinity peptide and other heterologous amino acid 
sequences from the fusion protein to render the protein of 
interest after purification. 

So that the manner in which the above-recited features, 
advantages, and objects of the invention become clear and can be 
understood in detail, particular descriptions of the invention may 
be had by reference to particular embodiments described in the 
Examples below; however, the following description and examples 
are given for the purpose of illustrating various specific 
embodiments of the invention, and are not meant to limit the 
scope of the invention in any fashion. 
EXAMPLE 1 



Extraction of lactate dehydrogenase from chicken breast muscle 

A naturally-occurring peptide sequence from the N- 
terminus of lactate dehydrogenase (LDH) from chicken muscle 
(Gallus gallus) was used for initial experiments. The protein 
includes a stretch of approximately 30 amino acids which has a 
sequence consistent with the general formula of the fusion protein 
of the present invention 

(SLKDHLIHNVHKEEHAHAHNKISVVGVGAVGM (SEQ ID No:6)). 
Further, LDH has the feature that the enzyme itself can be assayed 
easily for activity. Thus, the naturally-occurring chicken muscle 
LDH served as a "fusion protein" for these experiments in the 
sense that it contained both a high affinity peptide and a protein 
of interest. 
Extraction of chicken breast muscle LDH was 
performed by cutting 15 g of frozen chicken muscle, free of blood 
vessels, into small pieces and transferring the material to a 
commercial blender along with 150 mL of extraction buffer (50 
mM sodium phosphate, 1 mM EDTA, 1 mM magnesium acetate pH 
7.5, mM 2-mercaptoethanol (0.2 L) stored for at least 30 
minutes at 4°C). The mixture was homogenized twice at 4°C for 30 
seconds, with a 10-minute pause between the bursts. After the 
second homogenization, the mixture was transferred to centrifuge 
tubes and centrifuged at 4°C and 10,000 x g for 30 minutes. The 
clear supernatant was collected and used as a starting sample for 
the purification of lactate dehydrogenase. 
EXAMPLE 2 

Purification of lactate dehydrogenase 

Lactate dehydrogenase was purified by IMAC in the 
following manner: approximately 5 mL of Chelating Sepharose FF 
(Amersham, Pharmacia) was transferred to a vacuum bottle, 
diluted with an equal volume of deionized water and degassed 
under vacuum for 10 minutes. The gel suspension was poured 
into a column (10x1 cm i.d.) trapped on the bottom with a 
degassed adapter and left to settle. The column was then filled to 
the top with degassed deionized water, and a top adapter was 
gently pushed down the column bed until there was no space 
between the top surface of the gel and the adapter. The column 
was washed with 3 column volumes of deionized water at a flow 
rate of 0.5 mL per min. 

The chicken muscle extract (14 mL) was equilibrated 
by gel filtration on Sephadex G-25 columns with equilibration 
buffer (20 mM sodium phosphate buffer containing 1.0 M sodium 
chloride and 0.06 M imidazole pH 7.0 (1 L)). The IMAC column 
was then charged with Ni(II) ions using 20 mL of a 0.02 M 
Ni(NO3)2 solution. The excess metal was washed from the column 
with deionized water at a flow rate of 0.5 mL per minute and the 
column was then equilibrated with 5-10 volumes of equilibration 
buffer (20 mM sodium phosphate buffer containing 1.0 M sodium 
chloride and 0.06 M imidazole pH 7.0 (1 L)). 
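
As a worked illustration of the buffer compositions given in this example, the Python sketch below converts the stated molarities and volumes into masses to weigh out. The molar masses (sodium chloride 58.44 g/mol, imidazole 68.08 g/mol) are standard values supplied here as assumptions; the sodium phosphate component is omitted because the text does not state which phosphate salt was used. The function name is hypothetical.

```python
# Standard molar masses in g/mol (supplied for illustration, not from the text).
MOLAR_MASS_G_PER_MOL = {
    "sodium chloride": 58.44,
    "imidazole": 68.08,
}


def grams_needed(compound, molarity, volume_l):
    """Mass in grams = molar mass (g/mol) x molarity (mol/L) x volume (L)."""
    return MOLAR_MASS_G_PER_MOL[compound] * molarity * volume_l


# Equilibration buffer: 1.0 M NaCl and 0.06 M imidazole in 1 L.
equilibration_nacl_g = grams_needed("sodium chloride", 1.0, 1.0)
equilibration_imidazole_g = grams_needed("imidazole", 0.06, 1.0)

# Elution buffer: 1.0 M NaCl and 0.3 M imidazole in 0.2 L.
elution_nacl_g = grams_needed("sodium chloride", 1.0, 0.2)
elution_imidazole_g = grams_needed("imidazole", 0.3, 0.2)
```

With these values, 1 L of equilibration buffer and 0.2 L of elution buffer each call for about 4.08 g of imidazole, since 0.06 mol/L x 1 L and 0.3 mol/L x 0.2 L both equal 0.06 mol.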

The IMAC purification was carried out by loading the 
equilibrated extract onto the IMAC column at a flow rate of 0.5 
mL per min. Fractions of 1 mL were collected. The column was 
washed with equilibration buffer until a baseline was reached 
(absorbance of the fractions at 280 nm was less than 2 mAU 
higher than the absorbance of the equilibration buffer). The 
adsorbed material was eluted with elution buffer (20 mM sodium 
phosphate buffer containing 1.0 M sodium chloride and 0.3 M 
imidazole pH 7.0 (0.2 L)) and absorbance at 280 nm was 
determined on a spectrophotometer. Protein content of each 
fraction was determined as described in M. Bradford, Analytical 
Biochemistry, 72 (1976) 248, and lactate dehydrogenase activity 
was determined as described in F. Kubowitz and P. Ott, Biochem. Z., 
314 (1943) 94. Results indicated that more than 95% of the 
lactate dehydrogenase activity was recovered in the elution 
fractions. 
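
The baseline criterion and the recovery figure used in this example can be expressed as two small calculations. The Python sketch below is illustrative only: the function names are hypothetical, and the readings and activity units in the usage lines are made-up numbers chosen to show the arithmetic, not data from these experiments.

```python
def first_baseline_fraction(a280_mau, buffer_mau, tolerance_mau=2.0):
    """Index of the first wash fraction whose A280 is less than
    `tolerance_mau` above the equilibration buffer, or None if none is."""
    for index, reading in enumerate(a280_mau):
        if reading - buffer_mau < tolerance_mau:
            return index
    return None


def percent_recovery(loaded_units, eluted_units_per_fraction):
    """Enzyme activity recovered in the elution fractions, as a percentage
    of the activity loaded onto the column."""
    return 100.0 * sum(eluted_units_per_fraction) / loaded_units


# Hypothetical wash readings (mAU) and activities (units), for illustration.
wash_a280 = [850.0, 310.0, 42.0, 6.5, 2.9, 1.4]
baseline_index = first_baseline_fraction(wash_a280, buffer_mau=0.5)
recovery = percent_recovery(1000.0, [120.0, 540.0, 260.0, 45.0])
```

In the hypothetical data, the sixth wash fraction is the first to fall within 2 mAU of the buffer, and 965 of 1000 loaded units appear in the elution fractions, a recovery of 96.5%.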
EXAMPLE 3 

Characterization of lactate dehydrogenase binding 

Further experiments were performed both with native 
LDH, a tetramer of about 140 kD, and a subunit of the enzyme, 
obtained after warming of the crude chicken muscle extract to 
45°C for 10 minutes. Both the tetramer and the subunit were 
allowed to associate with the immobilized Ni support, and both 
forms of LDH were retained. This result demonstrates that the 
retention of the LDH enzyme on immobilized Ni is not peculiar to 
the tetrameric form of the enzyme; that is, binding does not 
require "cooperation" between subunits. Instead, the single 
subunit of the enzyme also had affinity for the nickel ion, and this 
affinity was demonstrated to be virtually identical to the affinity 
shown for the tetramer. Both the native protein and the subunit 
were adsorbed in buffer with an imidazole concentration up to 60 
mM and both were eluted completely at a concentration of 300 
mM imidazole. 

To ascertain that it is the polyhistidine portion of the 
LDH that provides affinity for the nickel ion, the tetrameric form of 
the LDH enzyme was subjected to CNBr cleavage to produce a 
mixture of peptides. This mixture of peptides was applied to a Ni- 
IDA column with metal ion capacity of 32 mmol per mL gel. 
Loading conditions were the same as those used for the 
purification of the enzyme from the crude extract, described 
above. The adsorbed material was eluted with 300 mM 
imidazole, and subjected to reversed-phase chromatography (RPC). 
The chromatographic peak containing about 80% of the adsorbed 
material was then subjected to amino acid analysis. The results 
obtained demonstrate that this peak corresponds to the N- 
terminal peptide from LDH and that this peptide contains the 
polyhistidine sequence. In addition, the fact that the peptide 
retained its binding affinity even after treatment with CNBr in 
the presence of 70% TFA indicates that the binding is not due to a 
rigid secondary structure. 
EXAMPLE 4 



Purification of lactate dehydrogenase on Co2+-TALON agarose 

Extraction of chicken breast muscle LDH was 
performed as in Example 1, and equilibrated by gel filtration on 
Sephadex G-25 columns with equilibration buffer (20 mM sodium 
phosphate buffer containing 1.0 M sodium chloride and 0.06 M 
imidazole pH 7.0 (1 L)). 

The IMAC column was prepared in the following 
manner: Approximately 2.75 mL of Co2+-TALON Superflow 6 
(Amersham, Pharmacia) was transferred to a vacuum bottle, 
diluted with the same volume of deionized water and degassed 
under vacuum for 10 minutes. The gel suspension was poured 
into a column (3x1 cm i.d.) trapped on the bottom with a 
degassed adapter and left to settle. The column was filled to the 
top with degassed deionized water, and a top adapter was gently 
pushed down toward the column bed until there was no space 
between the top surface of the gel and the adapter. The column 
was washed with 3 column volumes of deionized water at a flow 
rate of 0.5 mL per min. 
Purification of the fusion protein on Co2+-TALON 
Superflow 6 was carried out by first equilibrating the IMAC 
column with 5 to 10 column volumes of the equilibration buffer. 
The sample was then loaded on the IMAC column at a flow rate of 
1.0 mL per min, and 1 mL fractions were collected. The column 
was washed with the equilibration buffer until a baseline was 
reached (absorbance of the fractions at 280 nm was less than 2 
mAU higher than the absorbance of the equilibration buffer). 

The adsorbed material was eluted with elution buffer 
(20 mM sodium phosphate buffer containing 1.0 M sodium 
chloride and 0.3 M imidazole pH 7.0 (0.2 L)) and absorbance at 
280 nm was determined on a spectrophotometer. Protein content 
of each fraction was determined as described in M. Bradford, 
Analytical Biochemistry, 72 (1976) 248, and lactate 
dehydrogenase activity was determined as described in F. 
Kubowitz and P. Ott, Biochem. Z., 314 (1943) 94. As in Example 2, 
more than 95% of the lactate dehydrogenase activity was 
recovered in the elution fractions. 

EXAMPLE 5 

Isolation and purification of fusion protein consisting of affinity 
peptide and Green Fluorescent Protein UV Mutant (GFPuv) 

An affinity peptide/GFP fusion protein was isolated 
from E. coli cells which had been transformed with the pGFPuv.HS 
vector. Cell paste (0.39 g) was transferred to a pre-cooled mortar, 
1.2 g of alumina was added, and the mixture was ground for 2 
minutes. Extraction buffer (5 mL, stored at 4°C) was added, and, 
after additional grinding for 2 minutes, the mixture was 
transferred into four Eppendorf tubes and centrifuged for 12 
minutes at 12,000 rpm (11,750 x g). The clear supernatant 
(approximately 6 mL) was used as a starting sample for IMAC. 
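
The paired figures given for the centrifugation step, 12,000 rpm and 11,750 x g, are related through the standard relative centrifugal force formula RCF = 1.118 x 10^-5 x r x rpm^2, with r in centimetres. The Python sketch below applies that formula; the rotor radius is not stated in the text, so the value it yields is only the radius implied by the two stated figures. The function names are hypothetical.

```python
def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) for a rotor radius in cm."""
    return 1.118e-5 * radius_cm * rpm ** 2


def radius_cm_from(rcf, rpm):
    """Rotor radius in cm implied by a stated RCF and rotor speed."""
    return rcf / (1.118e-5 * rpm ** 2)


# Radius implied by the stated 12,000 rpm and 11,750 x g (about 7.3 cm).
implied_radius_cm = radius_cm_from(11750, 12000)
```

The same function can be used to find the speed setting needed on a rotor of different radius to reproduce the stated force.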

The extraction and chromatography equilibration 
buffers consisted of 20 mM sodium phosphate buffer containing 
1.0 M sodium chloride and 5 mM imidazole pH 7.0 (1 L). The 
elution buffer for IMAC consisted of 20 mM sodium phosphate 
buffer containing 1.0 M sodium chloride and 150 mM imidazole 
pH 7.0 (0.2 L). 

The IMAC was carried out in the following manner: 
Approximately 2.75 mL of Co2+-TALON Superflow 6 (Amersham, 
Pharmacia) was transferred to a vacuum bottle, diluted with the 
same volume of deionized water and degassed under vacuum for 
10 minutes. The gel suspension was poured into a column (3x1 
cm i.d.) trapped on the bottom with a degassed adapter and left 
to settle. The column was filled to the top with degassed 
deionized water, and a top adapter was gently pushed down 
toward the column bed until there was no space between the top 
surface of the gel and the adapter. The column was washed with 
3 column volumes of deionized water at a flow rate of 0.5 mL per 
min. 

Purification of the fusion protein on Co2+-TALON 
Superflow 6 was carried out by first equilibrating the IMAC 
column with 5 to 10 column volumes of the equilibration buffer. 
The sample was then loaded on the IMAC column at a flow rate of 
1.0 mL per min, and 1 mL fractions were collected. The column 
was washed with the equilibration buffer until a baseline was 
reached (absorbance of the fractions at 280 nm was less than 2 
mAU higher than the absorbance of the equilibration buffer). The 
adsorbed material was then eluted with elution buffer. 

Absorbance of each fraction at 280 nm was 
determined on a spectrophotometer, and protein content of each 
fraction was determined. Fluorescence of each fraction was 
determined on a microplate reader, and the purity of the fusion 
protein was determined also by SDS electrophoresis. More than 
85% of the fusion protein was recovered in the fractions obtained. 

Part of the cDNA sequence, and the amino acid sequence encoded 
by this cDNA sequence, of a vector containing the affinity peptide 
at the N-terminus of Green Fluorescent Protein-UV mutant 
(GFPuv) is shown in SEQ ID No. 1 and SEQ ID No. 2, respectively. 
The full cDNA sequence of a vector containing the construct of the 
affinity peptide at the N-terminus of GFPuv is shown in SEQ ID No. 
3. The full cDNA sequence of a vector containing part of the 
affinity peptide at the N-terminus of the enterokinase cleavage 
site and the amino acid sequence encoded by this cDNA 
corresponding to the start of translation site, the affinity peptide 
and the multiple cloning site are shown in SEQ ID Nos. 4 and 5. 

EXAMPLE 6 

Construction of fusion proteins 

A DNA sequence corresponding to the affinity peptide 
of the present invention is fused to the DNA coding sequence of a 
protein of interest. The polynucleotide sequence for the affinity 
peptide is fused most generally at or close to the DNA sequence 
coding for the N- or C-terminal amino acid of the protein of 
interest. This results in a DNA sequence which codes for a fusion 
protein comprising the affinity peptide and the protein of interest. 

In addition, a polynucleotide sequence that codes for a 
proteolytic site is incorporated into the fusion protein DNA 
sequence between the sequence for the affinity peptide and the 
sequence of the protein of interest. This type of DNA construct 
results in a fusion protein product having a proteolytic site. This 
site allows for the eventual regeneration of the protein of interest 
from the fusion protein by limited proteolysis and a second 
chromatography step. The second chromatography step, in which 
the product of the proteolysis is loaded onto an immobilized metal 
ion affinity column, results in the separation of the protein of 
interest from the affinity peptide. 
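
The construct and the regeneration scheme described above can be sketched at the level of amino acid sequences. The Python sketch below rests on several assumptions that are not taken from this example: the affinity peptide is represented by the histidine-rich stretch HLIHNVHKEEHAHAHNK found within SEQ ID No:6, the proteolytic site is the enterokinase recognition sequence DDDDK with cleavage after the lysine, the protein of interest is a made-up placeholder, and column binding is reduced to a simple test for the presence of the affinity peptide. All function names are hypothetical.

```python
AFFINITY_PEPTIDE = "HLIHNVHKEEHAHAHNK"   # histidine-rich stretch of SEQ ID No:6
EK_SITE = "DDDDK"                        # enterokinase site; cut follows the K


def build_fusion(protein_of_interest):
    """N-terminal fusion: affinity peptide, cleavage site, protein of interest."""
    return AFFINITY_PEPTIDE + EK_SITE + protein_of_interest


def cleave_after_site(fusion):
    """Split the fusion immediately after the first cleavage site."""
    index = fusion.find(EK_SITE)
    if index < 0:
        raise ValueError("no cleavage site found")
    cut = index + len(EK_SITE)
    return fusion[:cut], fusion[cut:]


def binds_metal_column(fragment):
    """Crude stand-in for IMAC binding: fragment carries the affinity peptide."""
    return AFFINITY_PEPTIDE in fragment


def second_imac_step(fragments):
    """Return (bound, flow_through) for the post-proteolysis column pass."""
    bound = [f for f in fragments if binds_metal_column(f)]
    flow_through = [f for f in fragments if not binds_metal_column(f)]
    return bound, flow_through


placeholder_protein = "MKTAYIAKQR"       # hypothetical protein of interest
fusion = build_fusion(placeholder_protein)
tag_fragment, released_protein = cleave_after_site(fusion)
bound, flow_through = second_imac_step([tag_fragment, released_protein])
```

In this sketch the released protein of interest appears in the flow-through while the fragment carrying the affinity peptide stays bound, mirroring the separation described for the second chromatography step.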

An additional embodiment of the present invention 
provides a DNA sequence coding for a polypeptide "secretion 
signal" introduced into the DNA that codes for the fusion protein. 
This secretion signal, when expressed, causes the fusion protein to 
be secreted into the culture media after the fusion protein is 
synthesized in the cell. Since a large number of cellular proteins 
are not transported out of the cell, isolation and purification of the 
fusion protein is enhanced as the requirements for cell disruption, 
extraction and removal of unwanted cell components are 
eliminated. 

The present invention is directed to a fusion protein of 
general formula R1-(HXn)m-R2 comprising: a protein of interest R1 
or R2 fused at its amino terminus or carboxy terminus to at least 
one affinity peptide, said affinity peptide having a formula (HXn)m, 
wherein H is histidine, X is an amino acid other than histidine, n= 
1-8, m= 2-30, and wherein if n=1 for more than two adjacent 
units of HX, at least one X must be asparagine, phenylalanine, 
tryptophan, tyrosine, lysine, methionine, arginine, glutamine, or 
cysteine. Preferably, n=1-4 and m=3-10. In one preferred 
embodiment, n=1-4 and m=3-6. Preferably, if n=1 for more than 
two adjacent units of HX, only one X is asparagine, phenylalanine, 
tryptophan, tyrosine, lysine, methionine, arginine, glutamine, or 
cysteine. In one preferred embodiment, the fusion protein has a 
sequence SEQ ID No. 1. The fusion protein may contain at least 
one protease cleavage site between said protein of interest and 
said affinity peptide. Preferably, the affinity peptide has affinity 
for metal ions. A representative metal ion is a nickel ion. The 
fusion protein may further comprise a secretion signal sequence. 

The present invention is also directed to a DNA 
sequence coding for a fusion protein of general formula R1- 
(HXn)m-R2 comprising: a protein of interest R1 or R2 fused at its 
amino terminus or carboxy terminus to at least one affinity 
peptide, said affinity peptide having a formula (HXn)m, wherein H 
is histidine, X is an amino acid other than histidine, n= 1-8, m= 
2-30, and wherein if n=1 for more than two adjacent units of HX, at 
least one X must be asparagine, phenylalanine, tryptophan, 
tyrosine, lysine, methionine, arginine, glutamine, or cysteine. 
Preferably, the fusion protein has the sequence shown in SEQ ID 
No:6. 

The present invention is also directed to a 
recombinant vector comprising a DNA sequence disclosed herein, 
wherein said recombinant vector is capable of directing 
expression of said DNA sequence for said fusion protein in a 
suitable host organism. The present invention is also directed to a 
host organism containing a recombinant vector disclosed herein, 
wherein said organism is capable of expressing said fusion 
protein. 

The present invention is also directed to a method for 
purifying the fusion protein disclosed herein, comprising the steps 
of: 

contacting a protein sample containing said fusion protein in a 
mixture with other proteins with a metal chelate resin under 
conditions where said fusion protein binds to said resin to produce 
a resin-fusion protein complex; washing said resin-fusion protein 
complex with a buffer to remove said other, unbound proteins; 
and eluting said bound fusion protein from the washed resin- 
fusion protein complex; wherein said eluted fusion protein is 
purified. This method may further comprise the step of cleaving 
said protein of interest from said affinity peptide. Moreover, this 
method further comprises the step of separating said cleaved 
protein of interest from said affinity peptide using a metal 
chelate resin under conditions where said affinity peptide binds to 
said metal of said resin and said protein of interest does not. 
Any patents or publications mentioned in this 
specification are indicative of the levels of those skilled in the art 
to which the invention pertains. These patents and publications 
are herein incorporated by reference to the same extent as if each 
individual publication was specifically and individually indicated 
to be incorporated by reference. 

One skilled in the art will readily appreciate that the
present invention is well adapted to carry out the objects and
obtain the ends and advantages mentioned, as well as those
inherent therein. The present examples along with the methods,
procedures, treatments, molecules, and specific compounds
described herein are presently representative of preferred
embodiments, are exemplary, and are not intended as limitations
on the scope of the invention. Changes therein and other uses will
occur to those skilled in the art which are encompassed within the
spirit of the invention as defined by the scope of the claims.

WHAT IS CLAIMED IS:

1. A fusion protein comprising: a protein of interest
fused at its amino terminus or carboxy terminus to at least one
affinity peptide, said fusion protein having a formula
R1-(HXn)m-R2, wherein R1 or R2 is said protein of interest, H is
histidine, X is an amino acid other than histidine, n=1-8,
m=2-30, and wherein if n=1 for more than two adjacent units of
HX, at least one X must be asparagine, phenylalanine, tryptophan,
tyrosine, lysine, methionine, arginine, glutamine, or cysteine.

2. The fusion protein of claim 1, wherein n=1-4.

3. The fusion protein of claim 1, wherein m=3-10. 

4. The fusion protein of claim 1, wherein n=1-4 and
m=3-6.

5. The fusion protein of claim 1, wherein if n=1 for
more than two adjacent units of HX, only one X is asparagine,
phenylalanine, tryptophan, tyrosine, lysine, methionine, arginine,
glutamine, or cysteine.

6. The fusion protein of claim 2, wherein said
fusion protein has a sequence SEQ ID No. 6.



7. The fusion protein of claim 1, wherein said
fusion protein contains at least one protease cleavage site between
said protein of interest and said affinity peptide.

8. The fusion protein of claim 1, wherein said
affinity peptide has affinity for metal ions.

9. The fusion protein of claim 8, wherein said metal
ions are nickel ions.

10. The fusion protein of claim 1, wherein said 
fusion protein further comprises a secretion signal sequence. 


11. A DNA sequence coding for a fusion protein
comprising a protein of interest fused at its amino- or
carboxy-terminus to at least one affinity peptide, said fusion
protein having the formula R1-(HXn)m-R2, wherein R1 or R2 is said
protein of interest, n=1-8, m=2-30, and wherein if n=1 for more
than two adjacent units of HX, at least one X must be asparagine,
phenylalanine, tryptophan, tyrosine, lysine, methionine, arginine,
glutamine, or cysteine.



12. The DNA sequence of claim 11, wherein said
fusion protein has the sequence shown in SEQ ID No:1.



13. A recombinant vector comprising a DNA
sequence of claim 11, wherein said recombinant vector is capable
of directing expression of said DNA sequence for said fusion
protein in a suitable host organism.

14. A host organism containing a recombinant vector
of claim 13, wherein said organism is capable of expressing said
fusion protein.



15. A method for purifying the fusion protein of
claim 1, comprising the steps of:

contacting a protein sample containing said fusion
protein in a mixture with other proteins with a metal chelate resin
under conditions where said fusion protein binds to said resin to
produce a resin-fusion protein complex;

washing said resin-fusion protein complex with a
buffer to remove said other, unbound proteins; and

eluting said bound fusion protein from the washed
resin-fusion protein complex; wherein said eluted fusion protein is
purified.

16. The method of claim 15, further comprising the
step of cleaving said protein of interest from said affinity peptide.



17. The method of claim 16, further comprising the
step of separating said cleaved protein of interest from said
affinity peptide using a metal chelate resin under conditions
where said affinity peptide binds to said metal of said resin and
said protein of interest does not.

SEQUENCE LISTING

<110> Tchaga, Grigoriy
      Jokhadze, George G.
<120> Compositions and Methods for Protein Purification
      Based on a Novel Metal Ion Affinity Site
<130> D6094PCT
<140>
<141> 1999-05-14
<150> US 09/078,687
<151> 1998-05-14
<160> 6



<210> 1
<211> 840
<212> DNA
<213> artificial sequence
<220>
<223> Partial cDNA sequence of a vector containing the
      affinity peptide at the N-terminus of Green
      Fluorescent Protein-UV mutant (GFPuv)
<400> 1

atgaccatga ttacgccaag cttgtctctc aaggatcatc tcatccacaa 50
tgtccacaaa gaggagcacg ctcatgccca caacaagatc agcgtggttg 100
gtgtgggtgc agttggaccg gtaagtaaag gagaagaact tttcactgga 150
gttgtcccaa ttcttgttga attagatggt gatgttaatg ggcacaaatt 200
ttctgtcagt ggagagggtg aaggtgatgc aacatacgga aaacttaccc 250
ttaaatttat ttgcactact ggaaaactac ctgttccatg gccaacactt 300
gtcactactt tctcttatgg tgttcaatgc ttttcccgtt atccggatca 350
tatgaaacgg catgactttt tcaagagtgc catgcccgaa ggttatgtac 400
aggaacgcac tatatctttc aaagatgacg ggaactacaa gacgcgtgct 450
gaagtcaagt ttgaaggtga tacccttgtt aatcgtatcg agttaaaagg 500
tattgatttt aaagaagatg gaaacattct cggacacaaa ctcgagtaca 550
actataactc acacaatgta tacatcacgg cagacaaaca aaagaatgga 600
atcaaagcta acttcaaaat tcgccacaac attgaagatg gatccgttca 650
actagcagac cattatcaac aaaatactcc aattggcgat ggccctgtcc 700
ttttaccaga caaccattac ctgtcgacac aatctgccct ttcgaaagat 750
cccaacgaaa agcgtgacca catggtcctt cttgagtttg taactgctgc 800
tgggattaca catggcatgg atgagctcta caaataatga 840

<210> 2
<211> 278
<212> PRT
<213> artificial sequence
<220>
<223> Amino acid [...] of a vector [...] N-terminus [...]
<400> 2

Met Thr Met Ile Thr Pro Ser Leu Ser Leu Lys Asp His Leu Ile
5 10 15

His Asn Val His Lys Glu Glu His Ala His Ala His Asn Lys Ile
20 25 30

Ser Val Val Gly Val Gly Ala Val Gly Pro Val Ser Lys Gly Glu
35 40 45

Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp Gly
50 55 60

Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu Gly Glu Gly
65 70 75

Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr
80 85 90

Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe Ser
95 100 105

Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Arg
110 115 120

His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu
125 130 135

Arg Thr Ile Ser Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala
140 145 150

Glu Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu
155 160 165

Lys Gly Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys
170 175 180

Leu Glu Tyr Asn Tyr Asn Ser His Asn Val Tyr Ile Thr Ala Asp
185 190 195

Lys Gln Lys Asn Gly Ile Lys Ala Asn Phe Lys Ile Arg His Asn
200 205 210

Ile Glu Asp Gly Ser Val Gln Leu Ala Asp His Tyr Gln Gln Asn
215 220 225

Thr Pro Ile Gly Asp Gly Pro Val Leu Leu Pro Asp Asn His Tyr
230 235 240

Leu Ser Thr Gln Ser Ala Leu Ser Lys Asp Pro Asn Glu Lys Arg
245 250 255

Asp His Met Val Leu Leu Glu Phe Val Thr Ala Ala Gly Ile Thr
260 265 270

His Gly Met Asp Glu Leu Tyr Lys
275

<210> 3 

<211> 3384 

<212> DNA 

<213> artificial sequence 
<220> 

<223> cDNA sequence of a vector containing the construct 
of the affinity peptide at the N-terminus of GFPuv 

<400> 3 

agcgcccaat acgcaaaccg cctctccccg cgcgttggcc gattcattaa 50 

tgcagctggc acgacaggtt tcccgactgg aaagcgggca gtgagcgcaa 100 

cgcaattaat gtgagttagc tcactcatta ggcaccccag gctttacact 150 

ttatgcttcc ggctcgtatg ttgtgtggaa ttgtgagcgg ataacaattt 200 

cacacaggaa acagctatga ccatgattac gccaagcttg tctctcaagg 250 

atcatctcat ccacaatgtc cacaaagagg agcacgctca tgcccacaac 300 

aagatcagcg tggttggtgt gggtgcagtt ggaccggtaa gtaaaggaga 350

agaacttttc actggagttg tcccaattct tgttgaatta gatggtgatg 400 

ttaatgggca caaattttct gtcagtggag agggtgaagg tgatgcaaca 450 

tacggaaaac ttacccttaa atttatttgc actactggaa aactacctgt 500 

tccatggcca acacttgtca ctactttctc ttatggtgtt caatgctttt 550 

cccgttatcc ggatcatatg aaacggcatg actttttcaa gagtgccatg 600 

cccgaaggtt atgtacagga acgcactata tctttcaaag atgacgggaa 650 

ctacaagacg cgtgctgaag tcaagtttga aggtgatacc cttgttaatc 700

gtatcgagtt aaaaggtatt gattttaaag aagatggaaa cattctcgga 750

cacaaactcg agtacaacta taactcacac aatgtataca tcacggcaga 800 

caaacaaaag aatggaatca aagctaactt caaaattcgc cacaacattg 850 

aagatggatc cgttcaacta gcagaccatt atcaacaaaa tactccaatt 900 

ggcgatggcc ctgtcctttt accagacaac cattacctgt cgacacaatc 950
tgccctttcg aaagatccca acgaaaagcg tgaccacatg gtccttcttg 1000
agtttgtaac tgctgctggg attacacatg gcatggatga gctctacaaa 1050 
taatgaattc caactgagcg ccggtcgcta ccattaccaa cttgtctggt 1100 
gtcaaaaata ataggcctac tagtcggccg tacgggccct ttcgtctcgc 1150 
gcgtttcggt gatgacggtg aaaacctctg acacatgcag ctcccggaga 1200 
cggtcacagc ttgtctgtaa gcggatgccg ggagcagaca agcccgtcag 1250 
ggcgcgtcag cgggtgttgg cgggtgtcgg ggctggctta actatgcggc 1300 
atcagagcag attgtactga gagtgcacca tatgcggtgt gaaataccgc 1350 
acagatgcgt aaggagaaaa taccgcatca ggcggcctta agggcctcgt 1400 
gatacgccta tttttatagg ttaatgtcat gataataatg gtttcttaga 1450 
cgtcaggtgg cacttttcgg ggaaatgtgc gcggaacccc tatttgttta 1500 
tttttctaaa tacattcaaa tatgtatccg ctcatgagac aataaccctg 1550 
ataaatgctt caataatatt gaaaaaggaa gagtatgagt attcaacatt 1600 
tccgtgtcgc ccttattccc ttttttgcgg cattttgcct tcctgttttt 1650 
gctcacccag aaacgctggt gaaagtaaaa gatgctgaag atcagttggg 1700 
tgcacgagtg ggttacatcg aactggatct caacagcggt aagatccttg 1750 
agagttttcg ccccgaagaa cgttttccaa tgatgagcac ttttaaagtt 1800 
ctgctatgtg gcgcggtatt atcccgtatt gacgccgggc aagagcaact 1850 
cggtcgccgc atacactatt ctcagaatga cttggttgag tactcaccag 1900 
tcacagaaaa gcatcttacg gatggcatga cagtaagaga attatgcagt 1950 
gctgccataa ccatgagtga taacactgcg gccaacttac ttctgacaac 2000 
gatcggagga ccgaaggagc taaccgcttt tttgcacaac atgggggatc 2050 
atgtaactcg ccttgatcgt tgggaaccgg agctgaatga agccatacca 2100 
aacgacgagc gtgacaccac gatgcctgta gcaatggcaa caacgttgcg 2150 
caaactatta actggcgaac tacttactct agcttcccgg caacaattaa 2200 
tagactggat ggaggcggat aaagttgcag gaccacttct gcgctcggcc 2250 
cttccggctg gctggtttat tgctgataaa tctggagccg gtgagcgtgg 2300 
gtctcgcggt atcattgcag cactggggcc agatggtaag ccctcccgta 2350 
tcgtagttat ctacacgacg gggagtcagg caactatgga tgaacgaaat 2400 
agacagatcg ctgagatagg tgcctcactg attaagcatt ggtaactgtc 2450 
agaccaagtt tactcatata tactttagat tgatttaaaa cttcattttt 2500 
aatttaaaag gatctaggtg aagatccttt ttgataatct catgaccaaa 2550 
atcccttaac gtgagttttc gttccactga gcgtcagacc ccgtagaaaa 2600 
gatcaaagga tcttcttgag atcctttttt tctgcgcgta atctgctgct 2650 
tgcaaacaaa aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa 2700 
gagctaccaa ctctttttcc gaaggtaact ggcttcagca gagcgcagat 2750 
accaaatact gtccttctag tgtagccgta gttaggccac cacttcaaga 2800
actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg 2850
gctgctgcca gtggcgataa gtcgtgtctt accgggttgg actcaagacg 2900
atagttaccg gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca 2950 
cacagcccag cttggagcga acgacctaca ccgaactgag atacctacag 3000 
cgtgagctat gagaaagcgc cacgcttccc gaagggagaa aggcggacag 3050 
gtatccggta agcggcaggg tcggaacagg agagcgcacg agggagcttc 3100 
cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc 3150 
tgacttgagc gtcgattttt gtgatgctcg tcaggggggc ggagcctatg 3200 
gaaaaacgcc agcaacgcgg cctttttacg gttcctggcc ttttgctggc 3250 
cttttgctca catgttcttt cctgcgttat cccctgattc tgtggataac 3300 
cgtattaccg cctttgagtg agctgatacc gctcgccgca gccgaacgac 3350 
cgagcgcagc gagtcagtga gcgaggaagc ggaa 3384 

<210> 4 
<211> 2754 
<212> DNA 

<213> artificial sequence 

<220> 

<223> cDNA sequence of a vector containing part of the
      affinity peptide at the N-terminus of enterokinase
      cleavage site

<400> 4 

gacgaaaggg cctcgtgata cgcctatttt tataggttaa tgtcatgata 50 

ataatggttt cttagacgtc aggtggcact tttcggggaa atgtgcgcgg 100 

aacccctatt tgtttatttt tctaaataca ttcaaatatg tatccgctca 150 

tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 200 

atgagtattc aacatttccg tgtcgccctt attccctttt ttgcggcatt 250 

ttgccttcct gtttttgctc acccagaaac gctggtgaaa gtaaaagatg 300 

ctgaagatca gttgggtgca cgagtgggtt acatcgaact ggatctcaac 350 

agcggtaaga tccttgagag ttttcgcccc gaagaacgtt ttccaatgat 400 

gagcactttt aaagttctgc tatgtggcgc ggtattatcc cgtattgacg 450 

ccgggcaaga gcaactcggt cgccgcatac actattctca gaatgacttg 500 

gttgagtact caccagtcac agaaaagcat cttacggatg gcatgacagt 550 

aagagaatta tgcagtgctg ccataaccat gagtgataac actgcggcca 600 

acttacttct gacaacgatc ggaggaccga aggagctaac cgcttttttg 650

cacaacatgg gggatcatgt aactcgcctt gatcgttggg aaccggagct 700 

gaatgaagcc ataccaaacg acgagcgtga caccacgatg cctgtagcaa 750 

tggcaacaac gttgcgcaaa ctattaactg gcgaactact tactctagct 800 






WO 99/57992 PCT/US99/10662 

tcccggcaac aattaataga ctggatggag gcggataaag ttgcaggacc 850 
acttctgcgc tcggcccttc cggctggctg gtttattgct gataaatctg 900 
gagccggtga gcgtgggtct cgcggtatca ttgcagcact ggggccagat 950 
ggtaagccct cccgtatcgt agttatctac acgacgggga gtcaggcaac 1000 
tatggatgaa cgaaatagac agatcgctga gataggtgcc tcactgatta 1050 
agcattggta actgtcagac caagtttact catatatact ttagattgat 1100 
ttaaaacttc atttttaatt taaaaggatc taggtgaaga tcctttttga 1150 
taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt 1200 
cagaccccgt agaaaagatc aaaggatctt cttgagatcc tttttttctg 1250 
cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac cagcggtggt 1300 
ttgtttgccg gatcaagagc taccaactct ttttccgaag gtaactggct 1350 
tcagcagagc gcagatacca aatactgtcc ttctagtgta gccgtagtta 1400 
ggccaccact tcaagaactc tgtagcaccg cctacatacc tcgctctgct 1450 
aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg 1500 
ggttggactc aagacgatag ttaccggata aggcgcagcg gtcgggctga 1550 
acggggggtt cgtgcacaca gcccagcttg gagcgaacga cctacaccga 1600 
actgagatac ctacagcgtg agctatgaga aagcgccacg cttcccgaag 1650
ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg aacaggagag 1700 
cgcacgaggg agcttccagg gggaaacgcc tggtatcttt atagtcctgt 1750 
cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag 1800 
gggggcggag cctatggaaa aacgccagca acgcggcctt tttacggttc 1850 
ctggcctttt gctggccttt tgctcacatg ttctttcctg cgttatcccc 1900 
tgattctgtg gataaccgta ttaccgcctt tgagtgagct gataccgctc 1950 
gccgcagccg aacgaccgag cgcagcgagt cagtgagcga ggaagcggaa 2000 
gagcgcccaa tacgcaaacc gcctctcccc gcgcgttggc cgattcatta 2050 
atgcagctgg cacgacaggt ttcccgactg gaaagcgggc agtgagcgca 2100 
acgcaattaa tgtgagttag ctcactcatt aggcacccca ggctttacac 2150 
tttatgcttc cggctcgtat gttgtgtgga attgtgagcg gataacaatt 2200 
tcacacagga aacagctatg accatgatta cgccaagctt gaaggatcat 2250 
ctcatccaca atgtccacaa agaggagcac gctcatgccc acaacaagat 2300 
cgatgacgat gacaaagtcg acggatcccc gggtaccgag ctcgtaatta 2350 
gctgagaatt cactggccgt cgttttacaa cgtcgtgact gggaaaaccc 2400 
tggcgttacc caacttaatc gccttgcagc acatccccct ttcgccagct 2450 
ggcgtaatag cgaagaggcc cgcaccgatc gcccttccca acagttgcgc 2500 
agcctgaatg gcgaatggcg cctgatgcgg tattttctcc ttacgcatct 2550 
gtgcggtatt tcacaccgca tatggtgcac tctcagtaca atctgctctg 2600 
atgccgcata gttaagccag ccccgacacc cgccaacacc cgctgacgcg 2650 
ccctgacggg cttgtctgct cccggcatcc gcttacagac aagctgtgac 2700
cgtctccggg agctgcatgt gtcagaggtt ttcaccgtca tcaccgaaac 2750
gcgc 2754



<210> 5
<211> 45
<212> PRT
<213> artificial sequence
<220>
<223> Amino acid sequence encoded by cDNA corresponding
      to the start of translation site, the affinity
      peptide and the multiple cloning site
<400> 5

Met Thr Met Ile Thr Pro Ser Leu Lys Asp His Leu Ile His Asn
5 10 15

Val His Lys Glu Glu His Ala His Ala His Asn Lys Ile Asp Asp
20 25 30

Asp Asp Lys Val Asp Gly Ser Pro Gly Thr Glu Leu Val Ile Ser
35 40 45



<210> 6 
<211> 32 
<212> PRT 

<213> artificial sequence 

<220> 

<223> Amino acid sequence of fusion protein where n=1-4
      for the affinity peptide
<400> 6 

Ser Leu Lys Asp His Leu Ile His Asn Val His Lys Glu Glu His
5 10 15

Ala His Ala His Asn Lys Ile Ser Val Val Gly Val Gly Ala Val
20 25 30

Gly Met
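
As a reading aid for SEQ ID No:6, the three-letter listing can be converted to one-letter code and the histidine positions located. This is an illustrative sketch only: the residues are transcribed from the listing above with isoleucine written as Ile, and the variable names are hypothetical.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# SEQ ID No:6 as transcribed from the sequence listing.
SEQ6 = """Ser Leu Lys Asp His Leu Ile His Asn Val His Lys Glu Glu His
Ala His Ala His Asn Lys Ile Ser Val Val Gly Val Gly Ala Val
Gly Met"""

one_letter = "".join(THREE_TO_ONE[res] for res in SEQ6.split())

# 1-based positions of histidine and the number of residues between
# consecutive histidines (the value of n for each HX unit but the last).
his_positions = [i for i, aa in enumerate(one_letter, start=1) if aa == "H"]
spacings = [b - a - 1 for a, b in zip(his_positions, his_positions[1:])]

print(one_letter, len(one_letter))
print(his_positions)
print(spacings)
```

The length printed can be compared with the <211> entry for SEQ ID No:6, and the spacings with the n=1-4 range stated in its <223> description.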